Ordering in spatial evolutionary games for pairwise collective strategy updates 
Evolutionary 2x2 games are studied with players located on a square lattice. During the evolution the
randomly chosen neighboring players try to maximize their collective income by adopting a random strategy 
pair with a probability dependent on the difference of their summed payoffs between the final and initial state 
assuming quenched strategies in their neighborhood. In the case of the anti-coordination game this system 
behaves like an anti-ferromagnetic kinetic Ising model. Within a wide region of social dilemmas this dynamical
rule supports the formation of a similar spatial arrangement of the cooperators and defectors ensuring the optimum
total payoff if the temptation to choose defection exceeds a threshold value dependent on the sucker's payoff. 
The comparison of the results with those achieved for pairwise imitation and myopic strategy updates has 
indicated the relevant advantage of pairwise collective strategy update in the maintenance of cooperation. 

I. INTRODUCTION

Most of the games represent simplified real-life situations 
and help us to find an optimum decision (action). Due to the 
simplifications the players have only a few options to choose 
and the corresponding incomes are quantified by a payoff ma- 
trix allowing us to apply the tools of mathematics. The theory 
of games has been used successfully both in economics and 
political decisions since the pioneering work of von Neumann 
and Morgenstern [1]. Subsequently the concept of the payoff
matrix was adopted by biologists to quantify the effect of
interactions between species on their fitness (characterizing
the capability to create offspring) in the mathematical models
of Darwinian evolution [2]. Since that time evolutionary game theory
provides a general mathematical framework for the investigation
of multi-agent systems, used widely in economics and other
social sciences, where imitation is substituted for the
creation of offspring.

In traditional game theory selfish and intelligent individu- 
als try to maximize their own payoff irrespective of others. In 
evolutionary game theory players repeat the games and some- 
times imitate (adopt) the neighbor's strategy if the neighbor 
received a higher score. It turned out that assuming local 
interactions among players the imitation supports the main- 
tenance of altruistic behavior even for Prisoner's Dilemma 
games where the individual interest is in conflict with the
common one and the selfish individual behavior drives the
well-mixed community into a state (called "tragedy of the
commons") with players exploiting (instead of helping) each other.

In parallel with theoretical investigations game theory is 
also used to study human and animal behaviors experimentally
[10]-[17]. These experiments have motivated the extension of
evolutionary games to study the effect of different types of
mutual help, e.g., charity [18], inequality (inequity) aversion
[19]-[21], and emotions [22], including juvenile-adult
interactions [23]. In general, the modelling of human decision
dynamics is one of the most important open problems in the
behavioral sciences [24]. The examples mentioned raise the
possibility that a player tries to optimize not only her
personal payoff but her local neighborhood's payoff as well.
Motivated by this option we consider the simplest case and
introduce a collective pairwise strategy update rule in which
two randomly chosen neighbors update their strategies
simultaneously in order to increase their summed payoff, each
payoff coming from games with all their neighbors on the
spatial system. This way
of strategy update can be considered as an extension of coop- 
erative games toward the spatial evolutionary games. Origi- 
nally, in cooperative games groups of players (coalitions) may 
perform coordinated behavior within the group to enhance the 
group's payoff. On the other hand, the present model implies a
connection between the theory of kin selection [25, 26] and
spatial models of viscous populations of altruistic relatives
helping each other [27]-[29].

As a consequence of the proposed strategy update rule, it 
will be shown that the previously mentioned "tragedy of the 
commons" state can be avoided even in the hard condition 
of the Prisoner's Dilemma game. In the latter case both
analytical and numerical approaches indicate the existence of
an ordered structure of cooperator and defector players on the
square lattice at sufficiently low noise level. (This
arrangement of alternative strategies resembles the sublattice
ordering of the anti-ferromagnetic Ising model.) It is worth
mentioning that a similar formation of strategies was also
reported by Bonabeau et al. [30] and by Weisbuch and Stauffer
[31] within the framework of social models. To explore and
identify the exclusive consequence of the proposed strategy
update rule, we
will compare the results with the outcomes of two previously 
applied dynamical rules. These are the imitation of a better 
neighbor and the so-called myopic strategy update rules. 

The present work is structured as follows. In Sec. II we
define the spatial evolutionary games with the mentioned
dynamical rules. The main results for these types of dynamical
rules are compared for the anti-coordination game in Sec. III.
Subsequently we will discuss the weak Prisoner's Dilemma games
using Monte Carlo (MC) simulations and mean-field analysis. In
Sec. V we present and compare the MC results for the three
dynamical rules within a relevant region of payoff parameters
describing social dilemmas [32]. As the dynamical rules
influence the sublattice ordering process significantly, some
aspects of domain growth are considered numerically in Sec. VI.
Finally we summarize the main results in Sec. VII.




II. MODELS WITH DIFFERENT DYNAMICAL RULES 

In the studied models each player follows one of the pure 
cooperate or defect (C or D) strategies. According to pairwise 
interaction a player's payoff is calculated by means of a 2 x 2
payoff matrix. For a given pair of equivalent players the
possible strategy-dependent payoffs are given by the payoff matrix

\[
\begin{array}{c|cc} & C & D \\ \hline C & a & b \\ D & c & d \end{array} \tag{1}
\]

where a (d) is received by the C (D) player if her co-player
follows the same strategy. On the other hand, if the players
choose opposite strategies the C player receives b while the D
player is rewarded by c. The anti-coordination (AC) game will be
considered when a = d = 0 and b = c = 1. In this case the
players receive the maximum payoff if they choose opposite
strategies. For the social dilemmas we also use a rescaled
payoff matrix [33] in such a way that a = 1 and d = 0, that is,
the mutual choice of C is better for both players. Despite this
the players can favor the choice of D if either c = T > 1 or
b = S < 0, where T refers to the temptation to choose defection
and S is the sucker's payoff. For the Prisoner's Dilemma (PD)
both conditions are satisfied and the players are driven to
choose D, yielding the second lowest individual income for
them. The Hawk-Dove, in short HD (also called Snowdrift or
Chicken), game describes the situation when T > 1 and S > 0,
while the Stag-Hunt (SH) game corresponds to the case T < 1 and
S < 0. The fourth quadrant of the T-S parameter plane is
represented by the Harmony (H) game where mutual C is the best
solution for the players. In the mentioned four quadrants of
the T-S plane the two-person one-shot games have different sets
of Nash equilibria.

In the present spatial models players are located on the sites
x of a square lattice consisting of L x L nodes under periodic
boundary conditions. Initially each player follows an s_x = C
or D strategy chosen at random. The payoff P_x is collected
from the mentioned matrix games with her four nearest
neighbors. According to the proposed pairwise collective
strategy update the evolution of the strategy distribution is
based on the following protocol. First, we choose two
neighboring players (x and y) at random and we evaluate their
payoffs (P_x and P_y) depending on their own strategies s_x and
s_y, and also on the neighboring strategies. Subsequently we
evaluate the payoffs P'_x and P'_y assuming that the given
players follow randomly chosen strategies s'_x and s'_y while
the neighborhood remains unchanged. As a consequence of the
randomly chosen strategy pair s'_x, s'_y, there are cases when
only one (or none) of the two players will modify her strategy.
Notice, however, that this strategy choice allows the pair of
players to select all four possible strategy pairs. Finally the
strategy pair, s'_x and s'_y, is accepted simultaneously with a
probability

\[
\frac{1}{1 + \exp[(P_x + P_y - P'_x - P'_y)/K]}, \tag{2}
\]



where K characterizes the average amplitude of noise disturb- 
ing the players' rational decision. 
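
The protocol above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the simulation code used for the reported results: the 0/1 encoding of C and D, the nested-list lattice, and the helper names `payoff`, `fermi`, and `pair_update` are hypothetical choices made only for this example.

```python
import math
import random

C, D = 0, 1  # hypothetical encoding of the two strategies
NEIGHBORS = ((1, 0), (-1, 0), (0, 1), (0, -1))

def payoff(s, x, y, m):
    """Payoff of site (x, y) from games with its four nearest neighbors.

    s is an L x L list of lists with periodic boundaries; m[own][other]
    is the payoff matrix of Eq. (1) with rows and columns ordered C, D.
    """
    L = len(s)
    return sum(m[s[x][y]][s[(x + dx) % L][(y + dy) % L]]
               for dx, dy in NEIGHBORS)

def fermi(delta, K):
    """1 / (1 + exp(delta / K)), guarded against floating-point overflow."""
    z = delta / K
    return 0.0 if z > 700.0 else 1.0 / (1.0 + math.exp(z))

def pair_update(s, m, K, rng=random):
    """One elementary step of the pairwise collective rule, Eq. (2)."""
    L = len(s)
    x, y = rng.randrange(L), rng.randrange(L)
    dx, dy = rng.choice(NEIGHBORS)
    u, v = (x + dx) % L, (y + dy) % L
    old = (s[x][y], s[u][v])
    p_old = payoff(s, x, y, m) + payoff(s, u, v, m)
    # Trial pair chosen at random; it may coincide with the old pair.
    s[x][y], s[u][v] = rng.choice((C, D)), rng.choice((C, D))
    p_new = payoff(s, x, y, m) + payoff(s, u, v, m)
    # Accept with probability 1 / (1 + exp[(P_x + P_y - P'_x - P'_y) / K]).
    if rng.random() >= fermi(p_old - p_new, K):
        s[x][y], s[u][v] = old  # reject: restore the previous pair
```

In this sketch one MC step would correspond to repeating `pair_update` L x L times; the lattice is modified in place, and the payoffs are recomputed locally because only the two focal sites change.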

The results of the above evolutionary process will be
contrasted with the consequences of two other dynamical rules
used frequently in previous studies. If the evolution is
controlled by stochastic imitation of the more successful
neighbor then player x adopts the neighboring strategy s_y
with a probability

\[
W_i = \frac{1}{1 + \exp[(P_x - P_y)/K]} \tag{3}
\]



dependent on the current payoff difference between players x
and y. Besides this, we also study a so-called myopic strategy
update in which a randomly chosen player x changes her strategy
s_x to a random strategy s'_x with a probability

\[
\frac{1}{1 + \exp[(P_x - P'_x)/K]}, \tag{4}
\]



where P_x and P'_x are the incomes of player x when playing s_x
and s'_x for the given neighborhood. Notice that the latter
strategy update is analogous to the Glauber dynamics used in
the kinetic Ising models [35]. Consequently, for symmetric
payoff matrices, b = c, (or potential games) the myopic strategy
update drives the spatial system into a thermal equilibrium (at
temperature K) that can be described by the Boltzmann
statistics [8, 36]. This means that an anti-ferromagnetic
ordering process is expected for the AC games with myopic
strategy update when decreasing the noise parameter K.
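
Both comparison rules use the same Fermi-type function of a payoff difference, which a short sketch makes explicit. The function names `fermi`, `w_imitation`, and `w_myopic` are hypothetical labels introduced here, and the overflow guard is an implementation detail that is not part of the model.

```python
import math

def fermi(delta, K):
    """1 / (1 + exp(delta / K)), guarded against floating-point overflow."""
    z = delta / K
    return 0.0 if z > 700.0 else 1.0 / (1.0 + math.exp(z))

def w_imitation(p_x, p_y, K):
    """Eq. (3): probability that x adopts the strategy of neighbor y."""
    return fermi(p_x - p_y, K)

def w_myopic(p_x, p_x_new, K):
    """Eq. (4): probability that x switches to the trial strategy s'_x."""
    return fermi(p_x - p_x_new, K)
```

The only difference between the two rules in this sketch is what is compared: imitation contrasts the current payoffs of two different players, whereas the myopic rule contrasts the current and the hypothetical payoff of the same player.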



III. RESULTS FOR ANTI-COORDINATION GAME 

Motivated by the above mentioned connection to the Ising
model, we first consider the anti-coordination game and study
the consequences of different strategy update rules. The
presented results of MC simulations were obtained typically for
a system size L = 400, but we used significantly larger system
sizes in the vicinity of the critical transitions to suppress
undesired fluctuations. During the evolution we have determined
the average portion p of players following the C strategy in
the stationary state. To describe the expected
anti-ferromagnetic ordering the square lattice is divided into
two sublattices (A and B) in analogy with the white and black
squares of a chessboard. In fact two equivalent types of
completely ordered structure exist in the limit K → 0. In both
cases the C and D strategies are present with the same
frequency (p = 1/2). In the first (second) case all the C
strategies are located on the sites of sublattice A (B). The
sublattice ordering will be characterized by an order parameter
M = |p_A - p_B|, where p_A and p_B denote the portion of C
strategy in the sublattices A and B. In a finite system the
sublattice ordering develops through a domain growth process
within a transient time.
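
As an illustration of the order parameter, the following sketch measures M on a lattice stored as a list of lists. The sublattice labelling by the parity of x + y, the 0/1 strategy encoding, and the function name `order_parameter` are assumptions of this example; an even L is needed for the two sublattices to be consistent with periodic boundaries.

```python
def order_parameter(s, cooperator=0):
    """Return M = |p_A - p_B| for an L x L strategy array s (L even).

    Sublattice A (B) contains the sites with even (odd) x + y, in
    analogy with the two colors of a chessboard.
    """
    L = len(s)
    coop = [0, 0]   # number of cooperators on sublattices A and B
    sites = [0, 0]  # number of sites on sublattices A and B
    for x in range(L):
        for y in range(L):
            sub = (x + y) % 2
            sites[sub] += 1
            if s[x][y] == cooperator:
                coop[sub] += 1
    p_a = coop[0] / sites[0]
    p_b = coop[1] / sites[1]
    return abs(p_a - p_b)
```

A perfect chessboard pattern gives M = 1 and any homogeneous state gives M = 0, matching the two limits discussed in the text.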

Starting with the myopic strategy update, defined by Eq. (4),
the MC data coincide with the exact results of the Ising model
[37] if the noise parameter (temperature in the latter case) is
rescaled by a factor of 2. Accordingly, a long-range ordered
state appears in the zero noise limit where cooperator and
defector players form a chessboard-like pattern. The order
parameter M varies from 1 to 0 if K is increased from 0 to
K_c = 1/ln(1 + √2), and M = 0 if K > K_c, as illustrated in
Fig. 1. When considering the analogy between the kinetic Ising
spin systems and evolutionary AC games one should keep in mind
that the Glauber dynamics [35] favors spin flips decreasing the
total energy and the opposite flips are generated by the
external noise (temperature). In contrast, in the evolutionary
games the myopic individuals wish to increase their own payoff
and the opposite decision is caused by noisy effects. The
necessity of the temperature rescaling is related to the fact
that in the kinetic Ising model with Glauber dynamics the
individual spin flips are controlled by the total energy
difference while for the evolutionary games the changes are
influenced by the individual payoff increase (ΔP_x), which is
half of the total payoff increase (if the payoff matrix is
symmetric) because the co-players share the income equally.
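
The rescaled exact curve can be evaluated directly. The sketch below assumes the standard closed form of the spontaneous magnetization of the square-lattice Ising model, M = [1 - sinh^(-4)(2J/k_B T)]^(1/8), with unit coupling J and the substitution k_B T = 2K suggested by the factor-of-2 rescaling above; the names `K_C` and `onsager_m` are hypothetical.

```python
import math

# Critical noise level quoted in the text, K_c = 1 / ln(1 + sqrt(2)).
K_C = 1.0 / math.log(1.0 + math.sqrt(2.0))

def onsager_m(K):
    """Exact Ising order parameter with temperature rescaled as k_B T = 2K."""
    if K <= 0.0:
        return 1.0
    if K >= K_C:
        return 0.0
    x = 1.0 / K            # equals 2J / (k_B T) for J = 1 and k_B T = 2K
    if x > 30.0:           # sinh(x)**-4 is negligible; avoid overflow
        return 1.0
    return (1.0 - math.sinh(x) ** -4) ** 0.125

print(round(K_C, 4))
```

With this mapping the order parameter vanishes exactly where sinh(1/K) = 1, which reproduces K_c = 1/ln(1 + √2), roughly 1.13.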

In the following we study the evolutionary AC game with the
pairwise collective strategy update rule defined by Eq. (2). In
agreement with our expectation this strategy update is capable
of finding the optimal global state and the long-range
sublattice ordering is established again when varying the
amplitude of noise. Figure 1 indicates a striking qualitative
similarity between the behaviors controlled by the myopic and
pairwise collective strategy updates. The slight difference
between the outputs stems from how payoffs change under the
different update processes. As we already mentioned, the change
of individual payoff is half of the global change for the
myopic rule. This is not necessarily true for the pairwise
collective rule because neighboring players receive nothing
from the game played between the focal x and y players. In a
special case, when the focal players adopt each other's
strategies simultaneously, the exact relation between the
changes of individual and global payoffs is restored. This
special process, when s'_x = s_y and s'_y = s_x, resembles the
Kawasaki spin-exchange of statistical mechanics. As a
consequence, this "limited" pairwise collective dynamics
reproduces the results of myopic update.

FIG. 1: (color online) Order parameter M as a function of K in the
anti-coordination games for myopic (green open squares) and
pairwise collective strategy updates (blue closed squares). The
solid line shows the theoretical result obtained by Onsager [37]
for the two-dimensional Ising model.

To close this section, we should stress that the third type of
strategy update, pairwise imitation, is unable to find the
optimal solution in which cooperator and defector players form
a long-range ordered state. In the following we will study the
consequences of the different updating rules when the payoff
matrix is not symmetric.



IV. RESULTS FOR WEAK PRISONER'S DILEMMA 

We start with a simple and popular parametrization of the
weak PD game (S = 0). If we fix the noise level, the only free
parameter is the temptation to defect, T. In the rest of the
paper we use the noise value K = 0.25, which has been shown to
offer an almost optimal cooperation level for the pairwise
imitation strategy update model [38].



A. Mean-field theory 



The pairwise imitation strategy update rule was already
investigated by means of mean-field theory (for a brief survey
see [5], [8] and further references therein). Within this
approach the stationary state is characterized by the average
fraction p of cooperators, which drops suddenly (at T = 1) from
1 to 0 if T is increased, for arbitrary values of K. For
completeness, we note that the results for arbitrary values of
S are also surveyed in a recent review [39] using replicator
dynamics in the imitation of the better strategies.

As expected, the other two strategy update rules, myopic and
pairwise collective, may allow sublattice ordering. To capture
this behavior we extend the mean-field analysis and introduce
two sublattices (A and B) where the fraction of cooperators
can be different (p_A and p_B, respectively). Using the general
payoff parameters given by Eq. (1), the average payoffs of
cooperator and defector players in the sublattices A and B can
be approximated by weighting the matrix elements with the
cooperator density of the opposite sublattice, for the present
connectivity structure where each player in sublattice A has
four neighboring players belonging to sublattice B and vice
versa.

For the myopic strategy update [defined by Eq. (4)] the time
derivative of the cooperator frequency p_A can be expressed
through these average payoffs, and a similar expression can be
derived for p_B by substituting B for the sublattice index A.
Inserting the payoff expressions into the rate equations, after
some mathematical manipulations one obtains a closed pair of
equations for p_A and p_B. The stationary values of the
cooperator densities p_A and p_B can be determined numerically
from these equations when the time derivatives of p_A and p_B
vanish.
For the case of the weak PD (a = 1, b = S = 0, c = T, and
d = 0) the T-dependence of the stationary solution exhibits two
types of behavior, as illustrated by green lines in the upper
plot of Fig. 2. Below a threshold value (depending on K) the
distribution of cooperators is homogeneous, that is, p_A = p_B.
Above the threshold temptation value the mean-field solution
predicts a (twofold degenerate) sublattice ordering. For one of
the ordered structures (at sufficiently high values of T)
sublattice A is occupied dominantly by defectors and in
sublattice B the players alternate their strategies between
cooperation and defection (i.e., p_B ≈ 0.5) because both
strategies yield the same payoff for them [notice that
P_B^(C) = P_B^(D) = 0 if p_A = 0]. For the second (equivalent)
ordered structure the roles of sublattices A and B are
exchanged. We should mention that there is an unstable
symmetric solution without sublattice ordering (p_A = p_B) for
high T values, as illustrated by the spaced dashed green line in
the upper plot of Fig. 2.
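
One explicit form consistent with the two-sublattice description above can be written as follows. It is a reconstruction under stated assumptions, not necessarily the expressions used for the figures: each player is assumed to collect payoff from four neighbors on the opposite sublattice, a randomly chosen player is assumed to propose the opposite strategy, and the proposal is accepted with the probability of Eq. (4). Any constant prefactor only rescales time.

```latex
\begin{align*}
P_A^{(C)} &= 4\,[a\,p_B + b\,(1-p_B)], & P_A^{(D)} &= 4\,[c\,p_B + d\,(1-p_B)],\\
P_B^{(C)} &= 4\,[a\,p_A + b\,(1-p_A)], & P_B^{(D)} &= 4\,[c\,p_A + d\,(1-p_A)],\\[4pt]
\dot p_A &= \frac{1-p_A}{1+\exp[(P_A^{(D)}-P_A^{(C)})/K]}
          - \frac{p_A}{1+\exp[(P_A^{(C)}-P_A^{(D)})/K]}\\
         &= \frac{1}{1+\exp[(P_A^{(D)}-P_A^{(C)})/K]} - p_A .
\end{align*}
```

The last step uses the fact that the two acceptance probabilities sum to one. For the weak PD this form gives P_A^(D) - P_A^(C) = 4(T - 1)p_B, and both payoffs in sublattice B vanish when p_A = 0, in line with the remark in the text.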

The mean-field analysis can also be performed for the pairwise
collective strategy update although the corresponding formulae
become more complicated because of the larger number of
elementary events (single and simultaneous two-site strategy
flips for two neighboring players). Omitting the technical
details, we only present the numerical results by blue lines in
the upper plot of Fig. 2. Similarly to the myopic case, the
stable solution is a homogeneous state for low T values and a
sublattice ordering appears above a threshold temptation. In
the latter case, one of the sublattices is occupied dominantly
by cooperators while the other sublattice is occupied by
defectors. Naturally, this solution is twofold degenerate
because the roles of sublattices A and B can be exchanged. An
unstable symmetric solution also exists at high T values, marked
by the spaced dotted blue line in the upper plot of Fig. 2.

In the following we will check the predictions of mean-field 
theory by using MC simulations. 
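
Before turning to the simulations, the qualitative mean-field picture for the myopic rule can be checked numerically with a few lines of code. The sketch assumes rate equations of the form dp_A/dt = f(p_B) - p_A and dp_B/dt = f(p_A) - p_B with f(p) = 1/(1 + exp[4(T - 1)p/K]) for the weak PD, which is one plausible reading of the two-sublattice approximation rather than a confirmed transcription; the function name, the Euler scheme, and the slightly asymmetric initial state are likewise choices made for this example.

```python
import math

def mean_field_myopic(T, K=0.25, p_a=0.5, p_b=0.6, dt=0.05, steps=8000):
    """Euler integration of the assumed two-sublattice rate equations.

    Returns the cooperator densities (p_A, p_B) reached at time dt * steps
    for the weak PD (a = 1, b = 0, c = T, d = 0).
    """
    def f(p):
        # Probability of choosing C when the opposite sublattice has density p.
        return 1.0 / (1.0 + math.exp(4.0 * (T - 1.0) * p / K))

    for _ in range(steps):
        p_a, p_b = (p_a + dt * (f(p_b) - p_a),
                    p_b + dt * (f(p_a) - p_b))
    return p_a, p_b

for T in (0.5, 2.0):
    print(T, mean_field_myopic(T))
```

Under these assumptions the low-T run relaxes to a homogeneous state with equal densities, while the high-T run ends with one sublattice almost empty of cooperators and the other close to 0.5, as described in the text.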



B. Simulations 

Similarly to the AC model, the typical system size was
L = 400 during the simulations. In the vicinity of the phase
transition point, however, we had to use larger systems (up to
L = 2000) to reach sufficient accuracy. The results for the
different strategy updates are summarized in the lower panel of
Fig. 2. For completeness, we quote here the known MC results
for the pairwise imitation update [38]. These results are
marked by red open circles in the plot. In contrast to the
mean-field prediction, C and D strategies can coexist if
T_c1 < T < T_c2 [where T_c1 = 0.942(1) and T_c2 = 1.074(1) for
K = 0.25]. For lower values of T only the C strategy can remain
alive in the final stationary state while C becomes extinct if
T > T_c2.

FIG. 2: (color online) Density of cooperators in the sublattices as a
function of T for the weak PD game on a square lattice at K = 0.25. 
The upper plot shows solutions predicted by mean-field theory. Solid 
(red), dashed (green), and dotted (blue) lines illustrate the stable so- 
lutions for pairwise imitation, myopic, and pairwise collective strat- 
egy update rules, respectively. Unstable solutions of the mean-field 
equations at high T values are denoted by spaced dashed and dotted 
lines for the last two dynamical rules. Lower plot shows the corre- 
sponding MC results. Using the same color coding, red open circles 
are for pairwise imitation, green open squares for myopic, and blue 
closed squares are for pairwise collective rules. 

In the case of the myopic strategy update, qualitatively similar
behavior was found (marked by green open squares in the lower
plot of Fig. 2). The only difference is a relatively high
portion of cooperator players in the high T region. The
survival of the C strategy is caused by the appearance of
solitary Cs in the sea of Ds because this event does not modify
the payoff of the given individual if S = 0. Evidently, the
probability of the mentioned process decreases if S becomes
negative, as happens in real PD situations.

In contrast to the prediction of the mean-field calculation the
MC simulations do not confirm the presence of long-range
ordering. Instead, the simulations give p_A = p_B = 0.2265(2)
if T is sufficiently high. It is worth mentioning that the more
sophisticated pair approximation (the results are not indicated
in the figure) reproduces the absence of sublattice ordering in
the case of S = 0. For slightly higher values of S (in the
region of the HD game), however, the prediction of the
mean-field calculation becomes qualitatively correct because MC
simulations indicate sublattice ordering, as detailed later on.

Finally, for the pairwise collective update, the MC data (blue
closed squares) fully support the prediction of the mean-field
calculations and show a sublattice ordering similar to the one
we observed for the AC game previously. Namely, the portion of
C strategy is distinguishable in the sublattices A and B if
T > T_c(K = 0.25) = 1.41(1). Notice furthermore that the average
density of Cs is significantly higher for this strategy update
in comparison with those provided by the pairwise imitation and
myopic rules.

The pairwise collective strategy update promotes the
chessboard-like arrangement of cooperators and defectors
because this constellation provides the highest total income
for the neighboring co-players (as well as for the whole
community) if T + S > 2R, where 2R is the total payoff of a
cooperator pair (R = a = 1 in the present rescaled notation).
This feature can be clearly recognized for the weak PD game
where the ordered structure is disturbed rarely by point
defects if T > 2 and the average payoff increases linearly with
T, as illustrated in Fig. 3. It is well known that traditional
game theory [9] suggests that the players alternate C and D in
opposite phase to receive (T + S)/2 on average in repeated
two-person games. The sublattice ordering in the spatial
evolutionary game can be considered as an alternative solution
to achieve the maximum average payoff.

FIG. 3: (color online) Average payoff versus T for three dynamical
rules at K = 0.25 in the case of weak PD. The MC results are illus- 
trated by red open circles for pairwise imitation, green open squares 
for myopic and blue closed squares for pairwise collective strategy 
adoption rule. The dashed line denotes the maximum average payoff 
available in the system. 



From the viewpoint of the pair of players (or the whole
society) the chessboard-like arrangement of cooperation and
defection has an advantage over the state of homogeneous
defection favored by several dynamical rules within a wide
payoff region of the PD. As a result, the ordered structure is
observable in the spatial strategy distribution even for T < 2
for the weak PD. Below a threshold value of T, however, the
long-range ordered strategy distribution is destroyed by the
noise and both types of ordered structures are present within
small domains. Simultaneously, a further decrease of T
increases the frequency of cooperators, which approaches the
saturation value (p = 1). The quantitative analysis of the
pair's payoff (P_x + P_y) shows that the homogeneous cooperation
remains stable against the appearance of a solitary D if
T < 5/4 for the case of weak PD. Similarly, in the perfect
sublattice ordered state (p_A = 1 and p_B = 0) the appearance of
an additional cooperator is preferred if T < 5/4. At the same
time the homogeneous defection is not stable because any new
cooperator increases the income of her neighbors (and also the
income of the pairs she belongs to). From the above features one
can conclude that a sharp transition occurs at T = 5/4 from the
homogeneous C state (p_A = p_B = 1) to one of the chessboard-like
structures (e.g., p_A = 1 and p_B = 0) if T is increased in the
limit K → 0. This expectation is confirmed by MC simulations
performed for several low values of the noise K.
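
The T = 5/4 threshold can be verified by direct bookkeeping of the pair's summed payoff on a small periodic lattice. The sketch below is only a worked check: the 0/1 encoding, the lattice size, the chosen focal sites, and the helper names `payoff`, `pair_gain`, and `weak_pd` are assumptions of this example.

```python
C, D = 0, 1  # hypothetical encoding of the two strategies
NEIGHBORS = ((1, 0), (-1, 0), (0, 1), (0, -1))

def payoff(s, x, y, m):
    """Payoff of site (x, y) from games with its four nearest neighbors."""
    L = len(s)
    return sum(m[s[x][y]][s[(x + dx) % L][(y + dy) % L]]
               for dx, dy in NEIGHBORS)

def pair_gain(s, m, site_x, site_y, new_x, new_y):
    """Change of P_x + P_y if two neighboring sites adopt (new_x, new_y)."""
    (x, y), (u, v) = site_x, site_y
    old = (s[x][y], s[u][v])
    before = payoff(s, x, y, m) + payoff(s, u, v, m)
    s[x][y], s[u][v] = new_x, new_y
    after = payoff(s, x, y, m) + payoff(s, u, v, m)
    s[x][y], s[u][v] = old  # leave the lattice unchanged
    return after - before

def weak_pd(T):
    """Weak PD payoff matrix m[own][other]: a = 1, b = S = 0, c = T, d = 0."""
    return [[1.0, 0.0], [T, 0.0]]

L = 6
all_c = [[C] * L for _ in range(L)]
all_d = [[D] * L for _ in range(L)]
chess = [[(i + j) % 2 for j in range(L)] for i in range(L)]  # C on even sites

for T in (1.2, 1.3):
    m = weak_pd(T)
    solitary_d = pair_gain(all_c, m, (2, 2), (2, 3), D, C)  # equals 4T - 5
    extra_c = pair_gain(chess, m, (2, 2), (2, 3), C, C)     # equals 5 - 4T
    lone_c = pair_gain(all_d, m, (2, 2), (2, 3), C, D)      # equals T
    print(T, solitary_d, extra_c, lone_c)
```

The signs of the first two quantities flip between T = 1.2 and T = 1.3, consistent with the threshold at 5/4, while a lone cooperator in the sea of defectors always raises the pair's income for S = 0.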

In the following we extend the payoff parametrization to
general social dilemmas and explore how robust the observed
long-range ordering is over the whole T-S plane.



V. RESULTS FOR SOCIAL DILEMMAS 

To reveal the possible sublattice ordering we carried out a
series of MC simulations for different S values using the same
noise level K = 0.25. The results are summarized by consecutive
curves in Fig. 4, allowing us to compare the territories of T-S
parameters where cooperators, defectors, or sublattice ordering
prevail in the stationary state for the three dynamical rules
we studied. It can be clearly seen, for example, that the
defector-dominated area of the T-S region shrinks significantly
if the evolution is controlled by the pairwise collective
strategy update. For this dynamical rule players favor choosing
cooperation in the sea of defectors if T + 4S > 0. This is the
reason why the fraction of cooperators is sufficiently high in
the case of weak PD. Notice furthermore that the pairwise
collective strategy update provides the best condition for the
cooperators to prevail in the system within the region of the
SH game. Referring to the previously discussed connection
between the AC model and the anti-ferromagnetic Ising model, the
SH game allows comparison with ferromagnetic ordering. The
application of the pairwise collective update reveals this
possibility and greatly extends the C-dominated phase in the SH
quadrant.
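
The condition T + 4S > 0 follows from a short count of the pair's income, sketched here with the rescaled payoffs a = 1, d = 0, b = S, c = T. The derivation is spelled out for illustration and assumes that only player x of the focal pair switches from D to C while all other players keep defecting.

```latex
\begin{align*}
\text{before:}\quad & P_x + P_y = 4d + 4d = 0,\\
\text{after:}\quad  & P'_x = 4S \quad (\text{a lone } C \text{ facing four } D\text{'s}),\qquad
                      P'_y = T + 3d = T,\\
\Delta(P_x + P_y) &= (P'_x + P'_y) - (P_x + P_y) = T + 4S .
\end{align*}
```

The pair therefore gains from the change exactly when T + 4S > 0, so in the limit of low noise the switch is accepted with high probability by the rule of Eq. (2).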

The upper (a) plot of Fig. 4 illustrates that sublattice
ordering occurs over a large region of T-S payoff parameters
within the territory of HD and PD games if the evolution is
controlled by the pairwise collective strategy update. The
critical temptation parameter T_c(K) varies monotonically with
the sucker's payoff S.

FIG. 4: (color online) Fraction p of cooperators in the two sublattices
of the square lattice as a function of T and S for a fixed noise level
(K = 0.25). Plots from top to bottom illustrate the MC results when
the evolution is controlled by pairwise collective (a), myopic (b), and
pairwise imitative strategy updates (c). Dotted (blue) and dashed
(green) lines denote distinguishable values of p in the sublattices A and B
while the solid (red) lines indicate p in the absence of this symmetry
breaking. The four quadrants of the T-S plane correspond to games
denoted by their abbreviation, namely, H="Harmony", HD="Hawk-Dove",
SH="Stag Hunt", and PD="Prisoner's Dilemma". Thicker
lines indicate data illustrated in the lower plot of Fig. 2.

For the myopic strategy update the sublattice ordering can be
observed only within the HD region of the payoff parameters, as
shown in the middle (b) plot of Fig. 4. It is emphasized that
this behavior is predicted by the mean-field calculation for
the case of weak PD, that is, at the boundary separating the
territories of HD and PD (see the lower plot of Fig. 2).
Comparing the p(T, S) surfaces of Figs. 4(a) and 4(b), it is
worth noting that the myopic update does not support
cooperation in the SH quadrant as effectively as the pairwise
collective update does.

Finally, Fig. 4(c) indicates clearly that the imitation of the
nearest neighbors on a square lattice does not support the
emergence of chessboard-like sublattice ordering. On the other
hand, the upper and lower plateaus (at p_A = p_B = 0 and 1)
represent absorbing states and the continuous transitions to
these homogeneous states belong to the directed percolation
universality class [40, 41]. Within the Stag Hunt region the
system exhibits a first-order phase transition for all three
rules.

Evidently, the sublattice ordering is prevented if the noise
level K becomes sufficiently high for any values of T and S.
Simultaneously, the region of disordered coexistence of C and
D strategies (where, e.g., 0.01 < p_A and p_B < 0.99) increases
with K.



VI. DOMAIN GROWTH FOR SUBLATTICE ORDERING 

The above simulations confirmed the appearance of sublattice
ordering for two of the three dynamical rules we studied. It is
emphasized, however, that many versions of similar spatial
social dilemmas were investigated previously without reporting
long-range ordering. Most of these investigations are based on
imitation [42]-[45], which prevents the formation of this type
of ordered strategy distribution as mentioned above. The
sublattice ordering is observable within small domains in the
snapshots published previously by several authors [46, 47].
Sysi-Aho et al. studied the HD game with myopic agents on a
square lattice with first- and second-neighbor interactions
[46]. In the latter model the randomly chosen agents are allowed
to modify their strategy only if this action increases their
own payoff [in contrast to the present stochastic myopic rule,
Eq. (4), which allows players to adopt an unfavored strategy
with a low probability]. Similar (short-range ordered) patterns
were reported by Wang et al. [47] who used a more complicated
synchronized strategy update.

In order to clarify the importance of possible disadvanta- 
geous strategy change in the framework of myopic update, we 
compare the domain growth processes for two different cases. 
In the first case the evolution is controlled by the myopic rule 
as defined by Eq. [4] In the second case we used the same 
strategy adoption probability (0]i but only if the new strategy 
increases the player's income [P x < P^.] otherwise the adop- 
tion of the new strategy s' x is forbidden. In other words, the 
second rule can be considered as a restriction of the first one 
when strategy change with payoff increment is allowed only. 
The visualization of the time-dependent strategy distribution 
indicated clearly that after a short ordering process the evo- 
lution is ended in a frozen pattern in the (restricted) second 
case if the simulation is started from a random initial state 
within the region of HD game. The difference between the 
time evolutions can be demonstrated if we compare the time- 
dependence of average payoffs U(t) for both cases. As Fig- 
ure [5] shows, the evolution of U(t) stops very early (at about 

FIG. 5: The upper and lower solid lines illustrate the time
dependence of the average payoff for the myopic (for K = 0.25) and
pairwise collective strategy adoption rules (for K = 0.25) at T = 1.5,
S = 0.5, and L = 4000. The dashed lines denote the time
dependence when the adoption of disadvantageous strategies is forbidden
for the same parameters.
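
The two myopic cases compared above can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes a Fermi-type adoption probability as our reading of Eq. (4), rescaled payoffs R = 1 and P = 0, and a periodic square lattice; the names `payoff`, `site_payoff` and `myopic_step` are hypothetical.

```python
import math
import random

def payoff(s, other, T=1.5, S=0.5):
    """Income of a player using s against a co-player using other (1 = C, 0 = D).
    Assumed rescaled payoffs: R = 1 for mutual C, P = 0 for mutual D."""
    if s == 1:
        return 1.0 if other == 1 else S
    return T if other == 1 else 0.0

def site_payoff(grid, x, y, s=None, T=1.5, S=0.5):
    """Total income of site (x, y) from its four nearest neighbors (periodic lattice).
    If s is given, it is used as a trial strategy instead of the current one."""
    L = len(grid)
    s = grid[x][y] if s is None else s
    nbrs = (((x + 1) % L, y), ((x - 1) % L, y), (x, (y + 1) % L), (x, (y - 1) % L))
    return sum(payoff(s, grid[i][j], T, S) for i, j in nbrs)

def myopic_step(grid, K=0.25, restricted=False, T=1.5, S=0.5, rng=random):
    """One elementary myopic update; returns True if the strategy was changed."""
    L = len(grid)
    x, y = rng.randrange(L), rng.randrange(L)
    trial = 1 - grid[x][y]  # with two strategies the trial is the other one
    p_old = site_payoff(grid, x, y, None, T, S)
    p_new = site_payoff(grid, x, y, trial, T, S)
    if restricted and p_new <= p_old:
        return False  # second case: no change without a payoff gain
    # Assumed Fermi-type adoption probability (our reading of Eq. (4)).
    w = 1.0 / (1.0 + math.exp((p_old - p_new) / K))
    if rng.random() < w:
        grid[x][y] = trial
        return True
    return False
```

Calling `myopic_step(grid, restricted=True)` repeatedly on a perfectly ordered (checkerboard) lattice never changes it, since every unilateral change lowers the player's income there; with `restricted=False` rare disadvantageous changes remain possible.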

Given the restriction imposed by the second update rule,
a frozen pattern of strategy distribution can be considered as
a state where every player is satisfied with her own strategy.
In other words, a frozen state is analogous to a Nash equilibrium
in the sense that a unilateral deviation from this strategy
profile would reduce the income of the given player. As
we demonstrated, a frozen state, composed of two
types of (equivalent) small ordered domains, can be avoided
by applying the first rule because irrational strategy changes
along the interfaces help to find the global optimum.

The phenomenon discussed above has inspired us to investigate
what happens for the pairwise collective strategy update
[defined by Eq. (2)] if the disadvantageous strategy adoptions
are prohibited, as was done earlier for the myopic update. Surprisingly, the
restriction of the pairwise collective update does not block the
domain growth process. The visualization of the evolution of the
strategy distribution indicates that strategies are not changed
in the bulk of ordered domains but vary exclusively along the
boundaries separating the ordered regions. Consequently, as
Fig. 5 illustrates, U(t) can increase continuously until
reaching the value $U_s = 4$ whether or not the strategy change is
restricted to cases with a payoff increment.
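
A minimal sketch of the pairwise collective update and its restricted variant is given below. It rests on several assumptions that are ours: the acceptance probability is taken to be of Fermi type in the summed payoff of the pair (our reading of Eq. (2)), the payoffs are rescaled to R = 1 and P = 0, and the lattice is periodic with linear size of at least 3; the names `pair_income` and `pair_collective_step` are hypothetical.

```python
import math
import random

def payoff(s, other, T, S):
    """Income of strategy s against other (1 = C, 0 = D); assumed R = 1, P = 0."""
    if s == 1:
        return 1.0 if other == 1 else S
    return T if other == 1 else 0.0

def neighbors(site, L):
    """Four nearest neighbors of a site on a periodic L x L lattice."""
    x, y = site
    return [((x + 1) % L, y), ((x - 1) % L, y), (x, (y + 1) % L), (x, (y - 1) % L)]

def pair_income(grid, a, b, sa, sb, T, S):
    """Summed income of neighboring sites a and b if they used sa and sb,
    with all other strategies in their neighborhood kept fixed (quenched)."""
    trial = {a: sa, b: sb}
    total = 0.0
    for site in (a, b):
        for n in neighbors(site, len(grid)):
            total += payoff(trial[site], trial.get(n, grid[n[0]][n[1]]), T, S)
    return total

def pair_collective_step(grid, K=0.5, T=1.5, S=0.5, restricted=False, rng=random):
    """One elementary update of a random neighboring pair; True if accepted."""
    L = len(grid)
    a = (rng.randrange(L), rng.randrange(L))
    b = rng.choice(neighbors(a, L))
    old = (grid[a[0]][a[1]], grid[b[0]][b[1]])
    new = (rng.randrange(2), rng.randrange(2))  # random trial strategy pair
    if new == old:
        return False
    u_old = pair_income(grid, a, b, old[0], old[1], T, S)
    u_new = pair_income(grid, a, b, new[0], new[1], T, S)
    if restricted and u_new <= u_old:
        return False  # disadvantageous (or neutral) pair changes are prohibited
    # Assumed Fermi-type acceptance in the summed payoff (our reading of Eq. (2)).
    w = 1.0 / (1.0 + math.exp((u_old - u_new) / K))
    if rng.random() < w:
        grid[a[0]][a[1]], grid[b[0]][b[1]] = new
        return True
    return False
```

For T = 1.5 and S = 0.5 a neighboring C-D pair inside a checkerboard domain collects 4S + 4T = 8, that is, 4 per player, and every alternative strategy pair yields less, so in this sketch accepted changes can only occur where the order is broken.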

During the domain growth process the deficiency of the average
payoff is located along the boundaries separating the ordered
regions; therefore $U_s - U(t)$ decreases proportionally
to the total length of interfaces. This behavior resembles
the domain growth in solid-state physics systems where the
growth kinetics is driven by the reduction of interfacial energy. In the
latter, so-called curvature-driven growth the excess
domain-boundary energy decays algebraically with an exponent n = 1/2
[48]. The time evolution of the payoff difference in Fig. 6
supports our argument and shows algebraic decay with the
mentioned exponent.

FIG. 6: Log-log plot of $U_s - U(t)$ versus time t for (full) myopic,
restricted, and full pairwise collective strategy adoption rules (from
top to bottom). The MC data and parameters are the same as those plotted
in Fig. 5. The dashed line denotes the slope (-1/2) of the theoretical
prediction.

Figure 6 illustrates that the deficiency of the average payoff
vanishes algebraically, that is, $U_s - U(t) \simeq \Gamma/\sqrt{t}$ for all
three cases exhibiting growing domains. The speed (prefactor
$\Gamma$) of ordering decreases rapidly as the noise level K is lowered
because the disadvantageous strategy updates become rarer
and rarer for sufficiently low values of noise. In the opposite
case, when K approaches its critical value $K_c$, which depends
on the payoff parameters (ordering exists only if $K < K_c$),
the domain growth also slows down due to the diverging
fluctuations. In the quantitative comparison of domain growth for
different dynamical rules the mentioned effects are compensated
by doubling the value of K for the pairwise collective
strategy update plotted in Figs. 5 and 6. Surprisingly, for
the latter dynamical rule the blocking of the disadvantageous
strategy adoptions makes the domain growth faster (this
phenomenon may be related to the reduced noise effects
along the interfaces).
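
The algebraic decay discussed above is usually checked by a straight-line fit on a log-log scale. The sketch below shows one way to estimate the exponent and the prefactor by ordinary least squares; the input series is synthetic (an exact inverse-square-root law with an arbitrary prefactor of 3), not simulation data, and the function name `loglog_slope` is hypothetical.

```python
import math

def loglog_slope(times, values):
    """Least-squares fit of log(values) = slope * log(times) + log(prefactor).
    Returns (slope, prefactor)."""
    xs = [math.log(t) for t in times]
    ys = [math.log(v) for v in values]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, math.exp(my - slope * mx)

# Synthetic payoff deficiency obeying an exact 1/sqrt(t) law (illustration only).
times = [2 ** k for k in range(4, 16)]
deficit = [3.0 / math.sqrt(t) for t in times]
slope, prefactor = loglog_slope(times, deficit)
```

With real Monte Carlo data the fit would be restricted to the late-time window, after the initial transient and before finite-size effects set in; the choice of that window is left to the user.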

Finally we mention that preliminary simulations have
confirmed the appearance of sublattice ordering for the models
with nearest- and next-nearest-neighbor interactions. More
precisely, in the latter case one can observe a four-sublattice
ordering process with many different types of ordered structures
resembling the periodic structures described by a two-dimensional
Ising model with first- and second-neighbor interactions [49, 50].
The poly-domain versions of most of these ordered structures were
reported previously by several authors who studied the evolutionary
HD games with a myopic strategy update when prohibiting the
acceptance of those strategies yielding a lower individual payoff
[45, 46]. We have checked what happens if the present myopic
evolutionary rule [given by Eq. (4)] is applied only if
$(P_x - P'_x) < 0$. In the latter case the domain growth process
is also blocked in a (frozen) poly-domain state, as described above.

VII. CONCLUSIONS AND OUTLOOK

In this work we have introduced a pairwise collective
strategy update and studied its impact on anti-coordination and
social dilemma games. Our proposal was motivated by real-life
experiences in which players act to increase not only their personal
payoff but also that of their local neighborhood. As a first step
along this avenue we have chosen pairs of players. To identify
and quantify the consequences of this strategy update, we have
also studied two frequently applied strategy updates, namely
myopic and pairwise imitation.

Our results highlight the emergence of a spatially ordered
distribution of strategies analogous to anti-ferromagnetic
ordering in spin systems. In contrast to the traditional imitation
rule, both the myopic and pairwise collective strategy updates
support the formation of an ordered strategy distribution favored
within a wide range of social dilemmas. This ordered arrangement
of cooperators and defectors can provide the maximum
total payoff in a wide range of payoff parameters for the social
dilemmas and it seems to be a general behavior on regular
networks characterizing connections between the players.
Within the context of social sciences the appearance of the
mentioned state can be interpreted as a possibility for the
community to avoid the "tragedy of the commons" (when all the
agents choose D and receive nothing) by sharing the two possible
roles (strategies) with a spatially ordered (rigid) structure.
The systematic comparison of the level of cooperation
(p) for the three different evolutionary rules has confirmed that
the pairwise collective strategy update provides the highest total
(average) income for the whole spatial community in most
of the region of payoff parameters.

We have also studied the kinetics of ordering and found further
similarities with physics-motivated systems. This investigation
highlighted the importance of irrational decisions as a
way to avoid trapped (frozen) states.

There are two main directions in which to extend the present work.
Firstly, as we already noted, other types of interaction graphs
can also be considered. Our preliminary results confirm that
the positive impact of the pairwise collective strategy update
is not restricted to the square lattice with nearest-neighbor
interactions. A similar ordering can emerge locally for a wide range of
interaction graphs. Secondly, the size of the group in which
players favor the group interest (instead of personal payoff)
can also be increased. In this case further improvement of
cooperation is expected.